Abstract

A widely used method for determining the similarity of two labeled trees is to compute a 
maximum agreement subtree of the two trees. Previous work on this similarity measure is only 
concerned with the comparison of labeled trees of two special kinds, namely, uniformly labeled 
trees (i.e., trees with all their nodes labeled by the same symbol) and evolutionary trees (i.e., 
leaf-labeled trees with distinct symbols for distinct leaves). This paper presents an algorithm 
for comparing trees that are labeled in an arbitrary manner. In addition to this generality, this 
algorithm is faster than the previous algorithms. 

Another contribution of this paper is on maximum weight bipartite matchings. We show how 
to speed up the best known matching algorithms when the input graphs are node-unbalanced 
or weight-unbalanced. Based on these enhancements, we obtain an efficient algorithm for a new 
matching problem called the hierarchical bipartite matching problem, which is at the core of our 
maximum agreement subtree algorithm. 



1 Introduction 

A labeled tree is a rooted tree with an arbitrary subset of nodes labeled with symbols. In recent
years, many algorithms for comparing such trees have been developed for diverse application areas
including biology [19, 23], chemistry, linguistics [21], computer vision, and structured
text databases [16, 17, 20].



A widely used measure of the similarity of two labeled trees is the notion of a maximum 
agreement subtree defined as follows. A labeled tree R is a label-preserving homeomorphic subtree 
of another labeled tree $T$ if there exists a one-to-one mapping $f$ from the nodes of $R$ to those of
$T$ such that for any nodes $u, v, w$ of $R$, (1) $u$ and $f(u)$ have the same label; and (2) $w$ is the least
common ancestor of $u$ and $v$ if and only if $f(w)$ is the least common ancestor of $f(u)$ and $f(v)$.
Let $T_1$ and $T_2$ be two labeled trees. An agreement subtree of $T_1$ and $T_2$ is a labeled tree which is
also a label-preserving homeomorphic subtree of the two trees. A maximum agreement subtree is
one which maximizes the number of labeled nodes. Let $\mathrm{mast}(T_1, T_2)$ denote the number of labeled
nodes in a maximum agreement subtree of $T_1$ and $T_2$.

In the literature, many algorithms for computing a maximum agreement subtree have been 
developed. These algorithms focus on the special cases where $T_1$ and $T_2$ are either (1) uniformly
labeled trees, i.e., trees with all their nodes unlabeled, or equivalently, labeled with the same symbol;
or (2) evolutionary trees [13], i.e., leaf-labeled trees with distinct symbols for distinct leaves.

We denote by $n$ the number of nodes in the labeled trees $T_1$ and $T_2$, and by $d$ the maximum
degree of $T_1$ and $T_2$. For uniformly labeled trees $T_1$ and $T_2$, Chung gave an algorithm to
determine whether $T_1$ is a label-preserving homeomorphic subtree of $T_2$ using $O(n^{2.5})$ time. Gupta
and Nishimura [11] gave an algorithm which actually computes a maximum agreement subtree
of $T_1$ and $T_2$ in $O(n^{2.5} \log n)$ time. For evolutionary trees, Steel and Warnow [24] gave the first
polynomial-time algorithm for computing a maximum agreement subtree. Farach and Thorup
improved the time complexity from $O(n^{4.5} \log n)$ to $O(n^{1.5} \log n)$. Faster algorithms for the
case $d = O(1)$ were also discovered. The algorithm of Farach, Przytycka and Thorup runs in
$O(\sqrt{d}\, n \log^3 n)$ time, and that of Kao [14] takes $O(n d^2 \log^2 n \log d)$ time. Cole et al. [3] gave an
$O(n \log n)$-time algorithm for the case where $T_1$ and $T_2$ are binary trees. Przytycka [22] attempted
to generalize the algorithm of Cole et al. so that the degree-2 restriction could be removed with
the running time being $O(\sqrt{d}\, n \log n)$.

For unrestricted labeled trees (i.e., trees where labels are not restricted to leaves and may not be
distinct), little work has been reported, but such trees have applications in several contexts. For
example, labeled trees are used to represent sentences in a structured text database [16, 17], where
querying such a database involves comparison of trees; an XML document can also be represented
by a labeled tree. Instead of solving special cases, this paper gives an algorithm to compute
$\mathrm{mast}(T_1, T_2)$ where $T_1$ and $T_2$ are unrestricted labeled trees. As detailed below, our algorithm not
only is more general but also uniformly improves or matches the previously best algorithms for
subtree homeomorphism and evolutionary tree comparison.

Let $\Delta_{T_1,T_2}$ (or simply $\Delta$ when the context is clear) $= \sum_{u \in T_1} \sum_{v \in T_2} \delta(u, v)$, where $\delta(u, v) = 1$
if nodes $u$ and $v$ are labeled with the same symbol, and 0 otherwise. Our algorithm computes
$\mathrm{mast}(T_1, T_2)$ in $O(\sqrt{d}\, \Delta \log \frac{2n}{d})$ time. Thus, if $T_1$ and $T_2$ are uniformly labeled trees, then $\Delta \le n^2$
and the time complexity of our algorithm is $O(\sqrt{d}\, n^2 \log \frac{2n}{d})$, which is faster than the Gupta-Nishimura
algorithm [11] for any $d$. If $T_1$ and $T_2$ are evolutionary trees, then $\Delta \le n$ and the time
complexity of our algorithm is $O(\sqrt{d}\, n \log \frac{2n}{d})$, which is better than the $O(\sqrt{d}\, n \log n)$ bound claimed
by Przytycka [22]. In particular, our algorithm attains the $O(n \log n)$ bound for binary trees [3].
Also for general evolutionary trees, our algorithm runs in $O(n^{1.5})$ time since $\sqrt{d}\, n \log \frac{2n}{d} = O(n^{1.5})$
for any degree $d$. This is faster than the $O(n^{1.5} \log n)$ time of the Farach-Thorup algorithm.
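The bound $\sqrt{d}\, n \log \frac{2n}{d} = O(n^{1.5})$ for every degree $d$ can be checked with a short calculation. The substitution $t = n/d$ below is our own device rather than the paper's, and logarithms are taken to base 2.

```latex
\[
\sqrt{d}\; n \log\frac{2n}{d}
  \;=\; n^{1.5} \cdot \frac{\log(2t)}{\sqrt{t}},
  \qquad t = \frac{n}{d} \ge 1 .
\]
```

The factor $\log(2t)/\sqrt{t}$ is continuous on $[1, \infty)$, equals 1 at $t = 1$, and tends to 0 as $t \to \infty$, so it is bounded by a constant and the whole expression is $O(n^{1.5})$.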

The efficiency achieved by our MAST algorithm is based on improved algorithms for computing
maximum weight matchings of bipartite graphs that satisfy some structural properties. Let
$G = (X, Y, E)$ be a bipartite graph with positive integer weights on its edges. Denote by $n$, $m$, $N$,
and $W$ the number of nodes, the number of edges, the maximum edge weight, and the total edge
weight of $G$, respectively. The best known algorithm for computing maximum weight bipartite
matchings was given by Gabow and Tarjan [10], which takes $O(\sqrt{n}\, m \log(nN))$ time. For some
applications where the total edge weight is small (say, $W = O(m)$), Kao et al. [15] gave a slightly
faster algorithm that runs in $O(\sqrt{n}\, W)$ time. Intuitively, a bipartite graph is node-unbalanced if
there are far fewer nodes on one side than on the other. It is weight-unbalanced if its total weight is
dominated by the edges incident to a few nodes; we call these nodes the dominating nodes. In this
paper, we show how to enhance these two matching algorithms when the input graphs are either
node-unbalanced or weight-unbalanced.

The node-unbalanced property has many practical applications and has been
exploited to improve various graph algorithms. For example, Ahuja et al. [2] adapted several
bipartite network flow algorithms such that the running times depend on the number of nodes in
the smaller side of the input bipartite graph instead of the total number of nodes. Tokuyama and
Nakano used this property to reduce the time complexity of the minimum cost assignment problem
[27] and the Hitchcock transportation problem [26]. This paper presents similar improvements for
maximum weight matching. Specifically, we show that the running times of the matching algorithms
of Gabow and Tarjan [10] and Kao et al. [15] can be improved to
$O(\min\{\sqrt{n_s}\, m \log(n_s N),\; m + n_s^{2.5} \log(n_s N)\})$ and $O(\sqrt{n_s}\, W)$, respectively, where $n_s$ is the number of nodes in the smaller side of
the input bipartite graph.

The weight-unbalanced property is exploited in another way. Given a weight-unbalanced bipartite
graph $G$, let $G'$ be the subgraph of $G$ with its dominating nodes removed. Note that $G'$
has a total weight much smaller than $G$ does. Based on the $O(\sqrt{n}\, W)$-time matching algorithm
of Kao et al. [15], finding a maximum weight matching of $G'$ is much faster than finding one of
$G$. To take advantage of this fact, we design an efficient algorithm that finds a maximum weight
matching of $G$ from that of $G'$. This algorithm is substantially faster than directly applying the
$O(\sqrt{n}\, W)$-time matching algorithm to $G$.

These results for unbalanced graphs provide a basis for solving a new matching problem called 
the hierarchical bipartite matching problem. This matching problem is at the core of our MAST 
algorithm and is defined as follows. Let $T$ be a rooted tree. Denote by $r$ the root of $T$. Let $C(u)$
denote the set of children of node $u$. Every node $u$ of $T$ is associated with a positive integer $w(u)$
and a weighted bipartite graph $G_u$ satisfying the following properties:

• $w(u) \ge \sum_{v \in C(u)} w(v)$.

• $G_u = (X_u, Y_u, E_u)$ where $X_u = C(u)$. Each edge of $G_u$ has a positive integer weight, and
there is no isolated node. For any node $v \in X_u$, the total weight of all the edges incident to
$v$ is at most $w(v)$. Thus, the total weight of the edges in $G_u$ is at most $w(u)$.
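The two properties above can be checked mechanically. The following is a minimal sketch under an assumed representation that is not from the paper: `children[u]` lists $C(u)$, `w[u]` is the node weight, and `graphs[u]` maps each edge `(v, y)` of $G_u$ (with `v` a child of `u`) to its weight. The function name is hypothetical.

```python
def check_instance(children, w, graphs):
    """Return True if the tree, weights and graphs satisfy both properties."""
    for u, kids in children.items():
        if not kids:
            continue  # leaves carry no bipartite graph
        # Property 1: w(u) >= sum of the children's weights.
        if w[u] < sum(w[v] for v in kids):
            return False
        edges = graphs[u]
        # X_u must be exactly C(u): every child has at least one edge.
        if {v for (v, _) in edges} != set(kids):
            return False
        # Edge weights must be positive integers.
        if any(not isinstance(wt, int) or wt <= 0 for wt in edges.values()):
            return False
        # Property 2: the edges at child v weigh at most w(v) in total.
        load = {v: 0 for v in kids}
        for (v, _), wt in edges.items():
            load[v] += wt
        if any(load[v] > w[v] for v in kids):
            return False
    return True
```

Nodes of $Y_u$ are taken to be whatever right endpoints appear in the edge dictionary, so no node on that side can be isolated in this representation.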

See Figure 1 for an example. For any weighted bipartite graph $G$, let $\mathrm{mwm}(G)$ denote a maximum
weight matching of $G$. The hierarchical matching problem is to compute $\mathrm{mwm}(G_u)$ for all internal
nodes $u$ of $T$. Let $b = \max_{u \in T} \min\{|X_u|, |Y_u|\}$ and $e = \sum_{u \in T} |E_u|$. The problem can be solved
by applying directly our results for node-unbalanced graphs; for example, it can be solved in
$O(\sum_{u \in T} \sqrt{b}\, |E_u| \log(b\, w(u))) = O(\sqrt{b}\, e \log w(r))$ time using the enhanced Gabow-Tarjan algorithm.
However, this time complexity is not yet satisfactory. When comparing labeled trees, we often
encounter instances of the hierarchical bipartite matching problem with $e$ being very large; in
particular, $e$ is asymptotically much greater than $w(r)$. We further improve the running time
to $O(\sqrt{b}\, w(r) + e)$ by making additional use of our technique for weight-unbalanced graphs and
exploiting trade-offs between the size of the bipartite graphs involved and their total edge weight.

The rest of the paper is organized as follows. Section 2 details our techniques of speeding up
the existing algorithms for unbalanced graphs. Section 3 gives an efficient algorithm for solving the
hierarchical matching problem. Finally, Section 4 describes our algorithm for computing maximum
agreement subtrees.

Figure 1: $T$ is an instance of the hierarchical bipartite matching problem. $G_u$ is the bipartite graph
associated with the node $u$.



2 Maximum weight matching of unbalanced graphs 

Throughout this section, let $G = (X, Y, E)$ be a weighted bipartite graph with no isolated nodes.
Let $n = |X| + |Y|$, $n_s = \min\{|X|, |Y|\}$, $m = |E|$, $N$ be the largest edge weight, and $W$ be the total
edge weight.

Suppose every edge of $G$ has a positive integer weight. Gabow and Tarjan [10] and Kao et
al. [15] gave an $O(\sqrt{n}\, m \log(nN))$-time algorithm and an $O(\sqrt{n}\, W)$-time one to compute $\mathrm{mwm}(G)$,
respectively.



2.1 Matchings of node-unbalanced graphs 

The following theorem speeds up the computation of $\mathrm{mwm}(G)$ if $G$ is node-unbalanced.

Theorem 1.

1. $\mathrm{mwm}(G)$ can be computed in $O(\sqrt{n_s}\, m \log(n_s N))$ time.

2. $\mathrm{mwm}(G)$ can be computed in $O(m + n_s^{2.5} \log(n_s N))$ time.

3. $\mathrm{mwm}(G)$ can be computed in $O(\sqrt{n_s}\, W)$ time.

Proof. Without loss of generality, we assume $n_s = |X| \le |Y|$. The statements are proved as follows.

Statement 1. For any node $v$ in $G$, let $a(v)$ be the number of edges incident to $v$. Suppose
$Y = \{y_1, y_2, \ldots, y_{k n_s + r}\}$ where $k \ge 1$, $0 \le r < n_s$, and $a(y_1) \le a(y_2) \le \cdots \le a(y_{k n_s + r})$. We partition
$Y$ into $Y_0 = \{y_1, \ldots, y_r\}$, $Y_1 = \{y_{r+1}, \ldots, y_{r+n_s}\}$, ..., and $Y_k = \{y_{r+(k-1)n_s+1}, \ldots, y_{r+k n_s}\}$. Note
that except $Y_0$, every set has $n_s$ nodes.

For any $Y' \subseteq Y$, denote by $G(Y')$ the subgraph of $G$ induced by all the edges incident to $Y'$.
Suppose that $M_i$ is a maximum weight matching of $G(Y_0 \cup Y_1 \cup \cdots \cup Y_i)$. Let $Y_{M_i} = \{y \mid (x, y) \in M_i\}$.
Note that a maximum weight matching of $G(Y_{M_i} \cup Y_{i+1})$ is also one of $G(Y_0 \cup Y_1 \cup \cdots \cup Y_{i+1})$.
Therefore, we can compute $\mathrm{mwm}(G)$ using the following algorithm:

• Step 1. Compute a maximum weight matching $M_0$ of $G(Y_0)$.

• Step 2. For $i = 1$ to $k$,

let $Y_{M_{i-1}} = \{y \mid (x, y) \in M_{i-1}\}$;

compute a maximum weight matching $M_i$ of $G(Y_{M_{i-1}} \cup Y_i)$.

• Step 3. Return $M_k$.
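The three steps can be sketched in a few lines. The code below is only an illustration under our own assumptions: edges are stored in a dictionary mapping `(x, y)` to a weight, and the inner solver `mwm` is a small exact dynamic program over subsets of $X$ that stands in for the Gabow-Tarjan algorithm, so the running time of the sketch is not the one claimed in Theorem 1. All function names are hypothetical.

```python
def mwm(edges, xs):
    """Exact maximum weight matching by DP over subsets of xs (tiny inputs only)."""
    index = {x: i for i, x in enumerate(xs)}
    by_y = {}
    for (x, y), w in edges.items():
        by_y.setdefault(y, []).append((index[x], x, w))
    best = {0: (0, ())}  # bitmask of matched x nodes -> (weight, matching)
    for y, incident in by_y.items():
        nxt = dict(best)  # option: leave y unmatched
        for mask, (wt, m) in best.items():
            for i, x, w in incident:
                if mask & (1 << i):
                    continue  # x is already matched
                key, cand = mask | (1 << i), (wt + w, m + ((x, y),))
                if key not in nxt or cand[0] > nxt[key][0]:
                    nxt[key] = cand
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1]


def weight(edges, matching):
    return sum(edges[e] for e in matching)


def batched_mwm(edges, xs, ys):
    """Steps 1-3: process Y in batches of n_s = len(xs) nodes, sorted by degree."""
    ns = len(xs)
    degree = {y: 0 for y in ys}
    for (_, y) in edges:
        degree[y] += 1
    order = sorted(ys, key=lambda y: degree[y])
    r = len(ys) % ns
    # Y_0 holds the r lowest-degree nodes; every later batch has n_s nodes.
    batches = [order[:r]] + [order[j:j + ns] for j in range(r, len(ys), ns)]
    matching = ()
    for batch in batches:
        # Keep only the y nodes matched so far plus the new batch.
        keep = {y for (_, y) in matching} | set(batch)
        sub = {e: w for e, w in edges.items() if e[1] in keep}
        matching = mwm(sub, xs)
    return matching
```

Each call to the inner solver sees at most $2 n_s$ nodes of $Y$, which is the point of the batching.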

The running time is analyzed below. Let $a(Y')$ be the total number of edges in $G(Y')$. For $1 \le
i \le k$, $a(Y_{M_{i-1}}) \le a(Y_i)$, and $a(Y_{M_{i-1}} \cup Y_i) \le 2a(Y_i)$. Using the matching algorithm by Gabow and
Tarjan [10], we can compute $\mathrm{mwm}(G(Y_{M_{i-1}} \cup Y_i))$ in $O(\sqrt{|Y_{M_{i-1}} \cup Y_i|}\; a(Y_i) \log(|Y_{M_{i-1}} \cup Y_i| N))$ time.
Note that $|Y_{M_{i-1}}| \le n_s$ and $|Y_i| = n_s$. Hence, the whole algorithm uses $O(\sum_{i=1}^{k} \sqrt{n_s}\, a(Y_i) \log(n_s N))$
$= O(\sqrt{n_s}\, m \log(n_s N))$ time.

Statement 2. Since we suppose $|X| \le |Y|$, any matching of $G$ contains at most $|X| = n_s$ edges.
Thus, for every $u \in X$, we can discard the edges incident to $u$ that are not among the $n_s$ heaviest
ones; the remaining edges must still contain a maximum weight matching of $G$. Note that we
can find these at most $n_s^2$ edges in $O(m)$ time, and from Statement 1, we can compute $\mathrm{mwm}(G)$ from them
in $O(\sqrt{n_s}\, n_s^2 \log(n_s N))$ time. The total time taken is $O(m + n_s^{2.5} \log(n_s N))$.
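The pruning step of Statement 2 is easy to illustrate. The sketch below assumes edges are given as a dictionary from `(x, y)` to weight and uses `heapq.nlargest`, which costs $O(m \log n_s)$ rather than the $O(m)$ stated in the proof; a linear-time selection routine would be needed to match that bound. The function name is hypothetical.

```python
import heapq


def prune_to_heaviest(edges, xs):
    """Keep, for each x in xs, only its len(xs) heaviest incident edges."""
    ns = len(xs)
    by_x = {x: [] for x in xs}
    for (x, y), w in edges.items():
        by_x[x].append((w, y))
    kept = {}
    for x, incident in by_x.items():
        # Compare on weight only, so y values never need to be comparable.
        for w, y in heapq.nlargest(ns, incident, key=lambda t: t[0]):
            kept[(x, y)] = w
    return kept
```

The pruning is safe because at most $n_s - 1$ other nodes of $X$ are matched, so among the $n_s$ heaviest edges at a node at least one leads to a node of $Y$ that no other matched edge uses.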

Statement 3. The algorithm in the proof of Statement 1 can be adapted to find $\mathrm{mwm}(G)$ in
$O(\sqrt{n_s}\, W)$ time by using the $O(\sqrt{n}\, W)$-time matching algorithm of Kao et al. [15] to compute each
$M_i$. For any $Y' \subseteq Y$, we redefine $a(Y')$ to be the total weight of edges incident to $Y'$. Then we can
use the same analysis to show that the adapted algorithm runs in $O(\sum_{i=1}^{k} \sqrt{n_s}\, a(Y_i)) = O(\sqrt{n_s}\, W)$
time. □

2.2 Matchings of weight-unbalanced graphs 

We show how to speed up the matching algorithm of Kao et al. [15] when the input graph $G$
is weight-unbalanced. The key technique is stated in Lemma 2 below. The following example
illustrates how this lemma can help. Suppose that $G$ has $O(1)$ dominating nodes. Let $G'$ be the
subgraph of $G$ with the dominating nodes removed. Let $W'$ be the total edge weight of $G'$. Since
$G$ is weight-unbalanced, we further assume $W' = o(W)$. To compute $\mathrm{mwm}(G)$, we can first use
Theorem 1(3) to compute $\mathrm{mwm}(G')$ in $O(\sqrt{n_s}\, W')$ time and then use Lemma 2 to compute $\mathrm{mwm}(G)$
from $\mathrm{mwm}(G')$ in $O(m \log n_s)$ time. The total running time is $O(m \log n_s + \sqrt{n_s}\, W') = o(\sqrt{n_s}\, W)$,
which is smaller than the running time of using the algorithm of Kao et al. [15] to find $\mathrm{mwm}(G)$
directly.

Lemma 2. Let $H = \{x_1, x_2, \ldots, x_h\}$ be a subset of $h$ nodes of $X$. Let $G - H$ be the subgraph
of $G$ constructed by removing the nodes in $H$. Denote by $E'$ the set of edges in $G - H$. Given
$\mathrm{mwm}(G - H)$, we can compute $\mathrm{mwm}(G)$ in $O(|E| + (h^2 |E'| + h^3) \log n_s)$ time.

Proof. First, we show that using $O(|E|)$ time, we can find a set $\Gamma$ of only $O(\min\{h|E'| + h^2, n_s^2\})$
edges such that $\Gamma$ still contains a maximum weight matching of $G$. In the proof of Theorem 1(2),
it has already been shown that we can find in $O(|E|)$ time a set of $O(n_s^2)$ edges that contains
$\mathrm{mwm}(G)$. Thus, it suffices to find in $O(|E|)$ time another set of $O(h|E'| + h^2)$ edges that contains
$\mathrm{mwm}(G)$; $\Gamma$ is just the smaller of these two sets. Let $Y'$ be the subset of nodes of $Y$ that are
endpoints of $E'$. For any $x_i \in H$, we select, among the edges incident to $x_i$, a subset of edges $E_i$,
which is the union of the following two sets:

• $\{(x_i, y) \mid y \in Y'\}$;

• $\{(x_i, y) \mid (x_i, y) \text{ is among the } h \text{ heaviest edges with } y \notin Y'\}$.

Observe that $E' \cup E_1 \cup \cdots \cup E_h$ must contain a maximum weight matching of $G$, and these
$|E' \cup E_1 \cup \cdots \cup E_h| = O(h|E'| + h^2)$ edges can be found in $O(|E|)$ time.

By discarding all unnecessary edges (i.e., edges neither in $\Gamma$ nor in $\mathrm{mwm}(G - H)$), we can assume
that $G$ has only $O(\min\{h|E'| + h^2, n_s^2\})$ edges, while still containing $\mathrm{mwm}(G)$ and $\mathrm{mwm}(G - H)$.
This preprocessing requires an extra $O(|E|)$ time for finding $\mathrm{mwm}(G)$.

Below, we describe a procedure which, given any bipartite graph $D$ and any node $x$ of $D$,
finds $\mathrm{mwm}(D)$ from $\mathrm{mwm}(D - \{x\})$ in $O(m_D \log m_D)$ time, where $m_D$ is the number of edges of $D$.
Then, starting from $G - H$, we can apply this procedure repeatedly $h$ times to find $\mathrm{mwm}(G)$ from
$\mathrm{mwm}(G - H)$. Since $G$ is assumed to have only $O(\min\{h|E'| + h^2, n_s^2\})$ edges, this process takes
$O(h((h|E'| + h^2) \log n_s))$ time. The lemma follows.

Let $M$ and $M_x$ be maximum weight matchings of $D$ and $D - \{x\}$, respectively; denote by $S$
the set of augmenting paths and cycles formed in $M \cup M_x - M \cap M_x$, and let $\sigma$ be the augmenting
path in $S$ starting from $x$. Note that the augmenting paths and cycles in $S - \{\sigma\}$ cannot improve
the matching $M_x$; otherwise, $M_x$ is not a maximum weight matching of $D - \{x\}$. Thus, we can
transform $M_x$ to $M$ using $\sigma$. Note that $\sigma$ is indeed a maximum augmenting path starting from $x$,
which can be found in $O(m_D \log m_D)$ time. □
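The last step of the proof, extending a maximum weight matching of $D - \{x\}$ to one of $D$ along the best alternating path that starts at $x$, can be illustrated as follows. This is only a sketch under our own assumptions: the path is found by exhaustive depth-first search, which is exponential in the worst case and is not the $O(m_D \log m_D)$ method the proof refers to. The function name and the edge dictionary format are hypothetical.

```python
def extend_matching(edges, matching, x0):
    """Given a maximum weight matching of D - {x0}, return one of D.

    edges: dict mapping (x, y) to a positive weight; matching: iterable of (x, y).
    """
    matching = set(matching)
    mate = {y: x for (x, y) in matching}          # matched y -> its partner x
    adj = {}
    for (x, y), w in edges.items():
        adj.setdefault(x, []).append((y, w))
    best = [0, []]                                # best gain and the edges to toggle

    def dfs(x, gain, path, used):
        for y, w in adj.get(x, []):
            if y in used:
                continue
            step = path + [(x, y)]
            if y not in mate:                     # path ends at an unmatched y
                if gain + w > best[0]:
                    best[:] = [gain + w, step]
                continue
            x2 = mate[y]                          # otherwise drop edge (x2, y) ...
            g2, step2 = gain + w - edges[(x2, y)], step + [(x2, y)]
            if g2 > best[0]:                      # ... and either stop, leaving x2 free,
                best[:] = [g2, step2]
            dfs(x2, g2, step2, used | {y})        # or keep augmenting from x2

    dfs(x0, 0, [], frozenset())
    return matching ^ set(best[1])                # toggle the edges on the best path
```

If no path has positive gain, the matching is returned unchanged, which corresponds to leaving $x$ unmatched.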

3 Hierarchical bipartite matching 

Throughout this section, let $T$ be a rooted tree as defined in the definition of the hierarchical
bipartite matching problem in §1. The root of $T$ is denoted by $r$. For each node $u$ of $T$, $w(u)$
and $G_u = (X_u, Y_u, E_u)$ denote the weight and the bipartite graph associated with $u$, respectively.
Furthermore, let $b = \max_{u \in T} \min\{|X_u|, |Y_u|\}$ and $e = \sum_{u \in T} |E_u|$.

In this section, we describe an algorithm for computing $\mathrm{mwm}(G_u)$ for all $u \in T$ in $O(\sqrt{b}\, w(r) + e)$
time. Our algorithm is based on two crucial observations. One is that for any value $x$, there are
at most $w(r)/x$ graphs whose second maximum edge weight is greater than $x$. The other is that
most of these graphs have their total weight dominated by edges incident to a few nodes. For those
graphs with a large second maximum edge weight, we compute their maximum weight matchings
using a less weight-sensitive algorithm. As there are not many of them, the computation is efficient.
For the other graphs, their weights are dominated by the edges incident to a few nodes. Thus,
using Lemma 2 and a weight-efficient matching algorithm, we can compute the maximum weight
matchings for these graphs efficiently. Details are as follows.

Consider any subset $B$ of nodes of $T$. Let $\delta = \min_{u \in B} w(u)$. We say that $B$ has critical degree
$h$ if for every $u \in B$, $u$ has at most $h$ children with weight at least $\delta$. For any internal node $u$,
let $\mathrm{secw}(u) = \text{2nd-max}\{w(v) \mid v \in C(u)\}$, i.e., the value of the second largest $w(v)$ over all the
children $v$ of $u$. Lemma 3 below shows the importance of $\mathrm{secw}(u)$ and critical degrees. Lemma 3(1)
shows that there are not many nodes $u$ with large $\mathrm{secw}(u)$; those nodes $u$ with small $\mathrm{secw}(u)$
should not have a large critical degree, and Lemma 3(2) states that the maximum weight
matchings associated with these nodes can be computed efficiently.

Lemma 3.

1. Let $x$ be any positive number. Let $A$ be the set of nodes $u$ of $T$ with $\mathrm{secw}(u) > x$. Then
$|A| \le w(r)/x$.

2. Let $B$ be any set of nodes of $T$. If $B$ has critical degree $h$, then we can compute $\mathrm{mwm}(G_u)$
for all $u \in B$ in $O((\sqrt{b} + h^3 \log b)\, w(r) + \sum_{u \in B} |E_u|)$ time.

Proof. The statements are proved as follows.

Statement 1. Let $L$ be the set of nodes $u$ in $T$ such that (1) $w(u) > x$ and (2) either $u$ is a
leaf or $w(v) \le x$ for all children $v$ of $u$. Since the subtrees rooted at the nodes of $L$ are disjoint,
$w(r) \ge \sum_{u \in L} w(u) \ge x|L|$. Thus, $|L| \le w(r)/x$. Let $T'$ be the tree in $T$ induced by $L$, i.e., $T'$
contains exactly the nodes of $L$ and the least common ancestor of every two nodes of $L$. Note that
$T'$ has at most $|L|$ leaves and at most $|L|$ internal nodes. On the other hand, every node of $A$ is
an internal node of $T'$; thus, $|A| \le |L| \le w(r)/x$.

Statement 2. For every node $u \in B$, let $H(u)$ be the set of $u$'s children that have a weight at
least $\delta = \min_{u \in B} w(u)$ each. Let $L(u)$ be the set of the rest of $u$'s children. Note that $|H(u)| \le h$
because $B$ has critical degree $h$. Since the weight of $G_u - H(u)$ is at most $\sum_{x \in L(u)} w(x)$ and
$b \ge \min\{|X_u|, |Y_u|\}$, by Theorem 1 we can compute $\mathrm{mwm}(G_u - H(u))$ in time

$$O\Big(\sqrt{b} \sum_{x \in L(u)} w(x) + |E_u|\Big). \qquad (1)$$

Since $G_u - H(u)$ has at most $\sum_{x \in L(u)} w(x)$ edges and $|H(u)| \le h$, by Lemma 2, we can compute
$\mathrm{mwm}(G_u)$ from $\mathrm{mwm}(G_u - H(u))$ in time

$$O\Big(|E_u| + \Big(h^2 \sum_{x \in L(u)} w(x) + h^3\Big) \log b\Big) = O\Big(h^3 \log b \sum_{x \in L(u)} w(x) + |E_u|\Big). \qquad (2)$$

From Equations (1) and (2), we can compute $\mathrm{mwm}(G_u)$ for all $u \in B$ in time

$$O\Big(\sum_{u \in B} \Big((\sqrt{b} + h^3 \log b) \sum_{x \in L(u)} w(x) + |E_u|\Big)\Big).$$

Since the subtrees rooted at the nodes in $\bigcup_{u \in B} L(u)$ are disjoint, $\sum_{u \in B} \sum_{x \in L(u)} w(x) \le w(r)$.
The statement follows. □
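The quantity $\mathrm{secw}(u)$ and the counting bound of Lemma 3(1) are easy to check on small examples. The sketch below assumes a tree stored as a `children` dictionary and a weight dictionary `w`, and it adopts the convention, which is ours and not the paper's, that $\mathrm{secw}(u) = 0$ when $u$ has fewer than two children.

```python
def secw(u, children, w):
    """Second largest child weight of u (0 if u has fewer than two children)."""
    ws = sorted((w[v] for v in children[u]), reverse=True)
    return ws[1] if len(ws) >= 2 else 0


def nodes_with_large_secw(children, w, x):
    """The set A of Lemma 3(1): nodes whose secw exceeds x."""
    return [u for u in children if secw(u, children, w) > x]
```

On any valid instance, where each node weighs at least as much as its children together, the returned list has at most $w(r)/x$ entries.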

We are now ready to compute $\mathrm{mwm}(G_u)$ for all nodes $u$ of $T$. We divide all the nodes in $T$ into
two sets: $\Phi = \{u \in T \mid \mathrm{secw}(u) > b^3\}$ and $\Pi = \{u \in T \mid \mathrm{secw}(u) \le b^3\}$.

Every node $u \in \Phi$ has $\mathrm{secw}(u) > b^3$; by Lemma 3(1), $|\Phi| \le w(r)/b^3$. Furthermore, by
Theorem 1(2), the time for computing $\mathrm{mwm}(G_u)$ for all $u \in \Phi$ is $O(\sum_{u \in \Phi} (b^{2.5} \log(b\, w(u)) + |E_u|))$
$= O(\frac{w(r)}{b^3}\, b^{2.5} \log w(r) + \sum_{u \in \Phi} |E_u|) = O(\frac{w(r) \log w(r)}{\sqrt{b}} + \sum_{u \in \Phi} |E_u|)$. This time complexity is still far
from our goal as $\log w(r)$ may be much larger than $\sqrt{b}$. To improve the time complexity, we first
note that using the technique for proving Lemma 3, we can compute $\mathrm{mwm}(G_u)$ in time depending
only on $\mathrm{secw}(u)$. Then, with a better estimation of $\mathrm{secw}(u)$, we can reduce the time complexity
to $O(w(r) + \sum_{u \in \Phi} |E_u|)$. Details are given in Lemma 4(1).

For $\Pi$, we can handle the nodes $u \in \Pi$ with $w(u) > b^3$ easily. For nodes with $w(u) \le b^3$, we
apply Lemma 3(2) to compute $\mathrm{mwm}(G_u)$. The basic idea is to partition the nodes $u \in \Pi$ into a
constant number of sets according to $w(u)$ such that every set has critical degree $b^{1/7}$. This
ensures that the total time to compute all the $\mathrm{mwm}(G_u)$ is $O(\sqrt{b}\, w(r) + \sum_{u \in \Pi} |E_u|)$. Details are
given in Lemma 4(2).

Lemma 4.

1. We can compute $\mathrm{mwm}(G_u)$ for all $u \in \Phi$ in $O(w(r) + \sum_{u \in \Phi} |E_u|)$ time.

2. We can compute $\mathrm{mwm}(G_u)$ for all $u \in \Pi$ in $O(\sqrt{b}\, w(r) + \sum_{u \in \Pi} |E_u|)$ time.

Proof. The two statements are proved as follows.

Statement 1. Observe that for any $u \in \Phi$, $G_u$ has at most $b^2$ edges relevant to the computation
of $\mathrm{mwm}(G_u)$, and they can be found in $O(|E_u|)$ time. Let $E'_u$ be this set of edges. Below, we
assume that, for every $u \in \Phi$, $G_u$ has only edges in $E'_u$. Otherwise, it costs $O(\sum_{u \in \Phi} |E_u|)$ extra
time to find all $E'_u$ and the assumption holds.

For every $k \ge 1$, let $\Phi_k = \{u \in \Phi \mid 2^{k-1} b^3 < \mathrm{secw}(u) \le 2^k b^3\}$. Obviously, the nonempty
sets $\Phi_k$ form a partition of $\Phi$. Below, we show that for any nonempty $\Phi_k$, we can compute
$\mathrm{mwm}(G_u)$ for all $u \in \Phi_k$ in $O(w(r) k / 2^k + \sum_{u \in \Phi_k} |E'_u| \log b)$ time. Thus, the time for computing
$\mathrm{mwm}(G_u)$ for all $u \in \Phi$ is $O(\sum_{k \ge 1} w(r) k / 2^k + \sum_{u \in \Phi} |E'_u| \log b) = O(w(r) + \sum_{u \in \Phi} |E'_u| \log b) =
O(w(r) + (w(r)/b^3)\, b^2 \log b) = O(w(r))$, and Statement 1 follows.

We now give the details of computing $\mathrm{mwm}(G_u)$ for all $u \in \Phi_k$. Let $u'$ be the child of $u$ where
$w(u')$ is the largest over all children of $u$. Since $\mathrm{secw}(u) \le 2^k b^3$, every edge of $G_u - \{u'\}$ has
weight at most $2^k b^3$. By Theorem 1(2) and Lemma 2 and the fact $b \ge \min\{|X_u|, |Y_u|\}$, we can
find $\mathrm{mwm}(G_u)$ in $O(\sqrt{b}\, b^2 \log(b\, 2^k b^3) + |E'_u| \log b)$ time. By Lemma 3(1), $|\Phi_k| \le w(r)/(2^{k-1} b^3)$. Thus, we can
compute $\mathrm{mwm}(G_u)$ for all $u \in \Phi_k$ in time

$$O\Big(\sum_{u \in \Phi_k} \big(\sqrt{b}\, b^2 \log(b\, 2^k b^3) + |E'_u| \log b\big)\Big) = O\Big(\frac{w(r)}{2^k \sqrt{b}} (k + \log b) + \sum_{u \in \Phi_k} |E'_u| \log b\Big) = O\Big(\frac{w(r)\, k}{2^k} + \sum_{u \in \Phi_k} |E'_u| \log b\Big).$$

Statement 2. We partition $\Pi$ as follows. Let $\Pi'$ be the set of nodes in $\Pi$ with weight greater
than $b^3$. For any $0 \le k \le 20$, let $\Pi_k = \{u \mid u \in \Pi \text{ and } b^{k/7} < w(u) \le b^{(k+1)/7}\}$. Obviously,
$\Pi = \Pi' \cup \Pi_0 \cup \cdots \cup \Pi_{20}$.

Since $\mathrm{secw}(u) \le b^3$ and $w(u) > b^3$ for all nodes $u$ in $\Pi'$, $\Pi'$ has critical degree one. By
Lemma 3(2), we can compute $\mathrm{mwm}(G_u)$ for all nodes in $\Pi'$ using $O(\sqrt{b}\, w(r) + \sum_{u \in \Pi'} |E_u|)$ time.

Each $\Pi_k$ is handled as follows. For every node $u \in \Pi_k$, $u$ has at most $b^{1/7}$ children with
weight at least $b^{k/7}$; otherwise $w(u) > b^{(k+1)/7}$ and $u \notin \Pi_k$. Thus, $\Pi_k$ has critical degree $b^{1/7}$. By
Lemma 3(2), we can compute $\mathrm{mwm}(G_u)$ for all $u \in \Pi_k$ in $O((\sqrt{b} + b^{3/7} \log b)\, w(r) + \sum_{u \in \Pi_k} |E_u|)$
$= O(\sqrt{b}\, w(r) + \sum_{u \in \Pi_k} |E_u|)$ time. In summary, we can compute $\mathrm{mwm}(G_u)$ for all $u \in \Pi$ in
$O(\sqrt{b}\, w(r) + \sum_{u \in \Pi} |E_u|)$ time. □

Theorem 5. We can compute $\mathrm{mwm}(G_u)$ for all nodes $u \in T$ in $O(\sqrt{b}\, w(r) + e)$ time.

Proof. It follows from Lemma 4 and the fact that $T = \Phi \cup \Pi$ and $\sum_{u \in T} |E_u| = e$. □

4 Computing maximum agreement subtrees



By generalizing the work of Cole et al. [3] on binary evolutionary trees, we can easily derive an
algorithm to compute a maximum agreement subtree of two labeled trees. There is, however, a
bottleneck of computing the maximum weight matchings of a large number of bipartite graphs with
nonconstant degrees. By using our result on the hierarchical bipartite matchings, we can eliminate
this bottleneck and obtain the fastest known MAST algorithm. Section 4.1 introduces basics of
labeled trees. Section 4.2 uses our results on the hierarchical bipartite matchings to remove the
bottleneck in our MAST algorithm. Section 4.3 details our MAST algorithm and analyzes its time
complexity. Section 4.4 discusses the generalization of the work of Cole et al. [3].

Throughout this section, $T_1$ and $T_2$ denote two labeled trees with $n$ nodes and of degree $d \ge 2$.
Let $\Delta_{T_1,T_2} = \sum_{u \in T_1} \sum_{v \in T_2} \delta(u, v)$, where $\delta(u, v) = 1$ if nodes $u$ and $v$ are labeled with the same
symbol, and 0 otherwise. Also, let $\Delta$ denote $\Delta_{T_1,T_2}$.
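Since $\Delta$ drives every bound in this section, it may help to see how it is computed. The sketch below assumes each tree is given simply as a list of node labels, with `None` marking an unlabeled node that contributes nothing; this representation and the function name are ours. To model uniformly labeled trees under this convention, give every node the same symbol.

```python
from collections import Counter


def delta(labels1, labels2):
    """Number of pairs (u, v), u in T1 and v in T2, carrying the same label."""
    c1 = Counter(l for l in labels1 if l is not None)
    c2 = Counter(l for l in labels2 if l is not None)
    # Each symbol s contributes (occurrences in T1) * (occurrences in T2).
    return sum(c1[s] * c2[s] for s in c1)
```

For two evolutionary trees the labels within each tree are distinct, so the result is at most $n$; for two uniformly labeled trees it is the product of the two node counts, at most $n^2$.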



4.1 Basics 

For a rooted tree $T$ and any node $u$ of $T$, let $T^u$ denote the subtree of $T$ that is rooted at $u$. For
any set $L$ of symbols, the restricted subtree of $T$ with respect to $L$, denoted by $T \| L$, is the subtree
of $T$ (1) whose nodes are the nodes with labels from $L$ and the least common ancestors of any
two nodes with labels from $L$ and (2) whose edges preserve the ancestor-descendant relationship of
$T$. Figure 2 gives an example. Note that $T \| L$ may contain nodes with labels outside $L$. For any
labeled tree $T'$, let $T \| T'$ denote the restricted subtree of $T$ with respect to the set of symbols used
in $T'$.
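The restricted subtree can be built with one post-order pass: a node is kept if its label is in $L$ or if at least two of its child subtrees contain kept nodes, in which case it is a least common ancestor of kept nodes. The following is a minimal recursive sketch, assuming the tree is given as a `children` dictionary plus a `label` dictionary (unlabeled nodes may be absent or map to `None`); these names are ours.

```python
def restricted_subtree(children, label, root, L):
    """Return (new_root, new_children) for T || L; new_root is None if T has no label in L."""
    new_children = {}

    def visit(u):
        # Topmost kept nodes found in the subtrees of u's children.
        tops = [t for v in children.get(u, []) for t in visit(v)]
        if label.get(u) in L or len(tops) >= 2:
            new_children[u] = tops   # u is kept: labeled from L, or an LCA of kept nodes
            return [u]
        return tops                  # u is skipped; at most one kept node passes through

    tops = visit(root)
    return (tops[0] if tops else None), new_children
```

A kept node that was retained only as a least common ancestor can carry a label outside $L$, matching the remark above.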

A centroid path decomposition of a rooted tree $T$ is a partition of its nodes into disjoint
paths as follows. For each internal node $u$ in $T$, let $C(u)$ denote the set of children of $u$. Among
the children of $u$, one is chosen as the heavy child, denoted by $\mathrm{hvy}(u)$, if the subtree of $T$ rooted at
$\mathrm{hvy}(u)$ contains the largest number of nodes; the other children of $u$ are side children. We call the
edge from $u$ to its heavy child a heavy edge. A centroid path is a maximal path formed by heavy
edges; the root centroid path is the centroid path that contains the root of $T$. See Figure 3 for an
example.
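The decomposition just described can be computed in linear time with two passes over the tree. The sketch below assumes the tree is given as a `children` dictionary and breaks ties between equally large children by taking the first one, a choice the definition leaves open. The function name is hypothetical.

```python
def centroid_paths(children, root):
    """Partition the nodes of a rooted tree into centroid (heavy) paths."""
    order, stack = [], [root]
    while stack:                          # iterative preorder: parents before children
        u = stack.pop()
        order.append(u)
        stack.extend(children.get(u, []))
    size = {}
    for u in reversed(order):             # children are sized before their parent
        size[u] = 1 + sum(size[v] for v in children.get(u, []))
    # Heavy child: the child whose subtree has the most nodes.
    heavy = {u: max(children[u], key=lambda v: size[v])
             for u in order if children.get(u)}
    heavy_children = set(heavy.values())
    paths = []
    for u in order:
        if u in heavy_children:
            continue                      # u lies inside a path started higher up
        path = [u]                        # u is the root or a side child: start a new path
        while path[-1] in heavy:
            path.append(heavy[path[-1]])
        paths.append(path)
    return paths
```

The first node of each returned path is the node closest to the root of the tree.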

Figure 3: A centroid path decomposition of a rooted tree.

Let $\mathcal{D}(T)$ denote the set of the centroid paths of $T$. Note that $\mathcal{D}(T)$ can be constructed in
$O(|T|)$ time. For every $P \in \mathcal{D}(T)$, the root of $P$, denoted $r(P)$, refers to the node on $P$ that is the
closest to the root of $T$, and $A(P)$ denotes the set of the side children of the nodes on $P$. For any
node $u$ on $P$, a subtree rooted at some side child of $u$ is called a side tree of $u$, as well as a side
tree of $P$. Let $\text{side-tree}(P)$ be the set of side trees of $P$. Note that for every $R \in \text{side-tree}(P)$,
$|R| \le |T^{r(P)}|/2$.

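The following lemma states two useful properties of the centroid path decomposition.

Lemma 6. Let $T_1$ and $T_2$ be two labeled trees.

1. $\sum_{P \in \mathcal{D}(T_1)} \Delta_{T_1^{r(P)}, T_2} \le \Delta_{T_1,T_2} \log n$.

2. $\sum_{P \in \mathcal{D}(T_1)} \sqrt{\min\{d, |T_1^{r(P)}|\}}\; \Delta_{T_1^{r(P)}, T_2} = O(\sqrt{d}\, \Delta_{T_1,T_2} \log \frac{2n}{d})$.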

Proof. The two statements are proved as follows.

Statement 1. A centroid path $P$ is attached to another centroid path $P'$ if the root of $P$ is the
child of a node on $P'$. We define the level of a centroid path as follows. The root centroid path has
level zero. A centroid path has level $i$ if it is attached to some centroid path with level $i - 1$. Note
that any subtree attached to a centroid path with level $i$ has size at most $n/2^{i+1}$. Thus, there are
at most $\log n$ different levels. Moreover, subtrees attached to centroid paths with the same level
are all disjoint.

For any $0 \le i < \log n$, denote by $D_i$ the set of all centroid paths in $\mathcal{D}(T_1)$ with level $i$. Then

$$\sum_{P \in \mathcal{D}(T_1)} \Delta_{T_1^{r(P)}, T_2} = \sum_{0 \le i < \log n}\ \sum_{P \in D_i} \Delta_{T_1^{r(P)}, T_2} \le \sum_{0 \le i < \log n} \Delta_{T_1,T_2} \le \Delta_{T_1,T_2} \log n.$$

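Statement 2. We divide the centroid paths into two groups. We first consider the centroid paths
on level $i$ where $0 \le i < \log \frac{2n}{d}$. For any such $i$,

$$\sum_{P \in D_i} \sqrt{\min\{d, |T_1^{r(P)}|\}}\; \Delta_{T_1^{r(P)}, T_2} \le \sum_{P \in D_i} \sqrt{d}\, \Delta_{T_1^{r(P)}, T_2} \le \sqrt{d}\, \Delta_{T_1,T_2}.$$

Thus, $\sum_{0 \le i < \log \frac{2n}{d}} \sum_{P \in D_i} \sqrt{\min\{d, |T_1^{r(P)}|\}}\; \Delta_{T_1^{r(P)}, T_2} \le \sqrt{d}\, \Delta_{T_1,T_2} \log \frac{2n}{d}$.

Next, we consider the centroid paths on level $\log \frac{2n}{d} + i$ where $i \ge 0$. Note that for a path $P$ on
level $\log \frac{2n}{d} + i$, $|T_1^{r(P)}| \le d/2^{i+1}$. Thus, $\sum_{i \ge 0} \sum_{P \in D_{\log \frac{2n}{d} + i}} \sqrt{\min\{d, |T_1^{r(P)}|\}}\; \Delta_{T_1^{r(P)}, T_2}$ is at most

$$\sum_{i \ge 0}\ \sum_{P \in D_{\log \frac{2n}{d} + i}} \sqrt{d/2^{i+1}}\; \Delta_{T_1^{r(P)}, T_2} \le \sum_{i \ge 0} \sqrt{d/2^{i+1}}\; \Delta_{T_1,T_2} = O(\sqrt{d}\, \Delta_{T_1,T_2}).$$

□

The following notion captures which pairs of nodes of two labeled trees $T_1$ and $T_2$ are important.
Consider any centroid paths $P \in \mathcal{D}(T_1)$ and $Q \in \mathcal{D}(T_2)$. For any node $x \in P$ or $Q$, let $\mathcal{L}(x)$ be
the set of symbols labeling $x$ and the nodes in the side trees of $x$. Let $\mathrm{inp}(P, Q)$ be the set of node
pairs $(u, v) \in P \times Q$ with $\mathcal{L}(u) \cap \mathcal{L}(v) \ne \emptyset$. Let $\mathrm{inp}(P, T_2) = \bigcup_{Q \in \mathcal{D}(T_2)} \mathrm{inp}(P, Q)$.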

4.2 Matchings 

As explained later in §4.4, we can easily generalize the dynamic programming approach in [3] to
compute $\mathrm{mast}(T_1, T_2)$ for any two labeled trees $T_1$ and $T_2$, but there is a bottleneck of computing
the maximum weight matchings of a large number of bipartite graphs with nonconstant degrees.
This section uses our results on hierarchical bipartite matchings to remove this bottleneck.

First of all, we identify the bipartite graphs for which maximum weight matchings are required.
For any nodes $u \in T_1$ and $v \in T_2$, define $G_{uv}$ as the weighted bipartite graph between $C(u)$ and
$C(v)$ where edge $(x, y)$ has weight $\mathrm{mast}(T_1^x, T_2^y)$. Furthermore, define $H_{uv}$ as the graph constructed
from $G_{uv}$ by removing all the zero-weight edges and all the edges incident to the heavy child of $u$
or $v$. Note that the total edge weight of $H_{uv}$ can be significantly smaller than that of $G_{uv}$. Yet by
Lemma 2, we can recover $\mathrm{mwm}(G_{uv})$ from $\mathrm{mwm}(H_{uv})$ efficiently.

This section shows that for any centroid path $P \in \mathcal{D}(T_1)$, we can efficiently compute $\mathrm{mwm}(H_{uv})$
for all $(u, v) \in \mathrm{inp}(P, T_2)$. More precisely, let $TM_P$ denote the required time; the key result of this
section is that $\sum_{P \in \mathcal{D}(T_1)} TM_P = O(\sqrt{d}\, \Delta \log \frac{2n}{d})$ (see Lemma 8).

To derive an upper bound on $TM_P$, we need an estimate of the number of edges in the graphs
$H_{uv}$ for all $(u, v) \in \mathrm{inp}(P, T_2)$. For any centroid path $P \in \mathcal{D}(T_1)$, let $\mathrm{tnoe}(P)$ be the total number
of edges in the graphs $H_{uv}$ for all $(u, v) \in \mathrm{inp}(P, T_2)$. Furthermore, let $\mathrm{tnoe} = \sum_{P \in \mathcal{D}(T_1)} \mathrm{tnoe}(P)$.

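Lemma 7.

1. For any $P \in \mathcal{D}(T_1)$, $\mathrm{tnoe}(P) = O\Big(\sum_{w \in A(P)} |T_2 \| T_1^w| \log \frac{2\,|T_2 \| T_1^{r(P)}|}{|T_2 \| T_1^w|}\Big)$.

2. $\mathrm{tnoe} = O(\Delta \log n)$.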

Proof. The two statements are proved as follows.

Statement 1. Let $r$ be the root of $P$. By definition, every edge in $H_{uv}$ for any $(u, v) \in
\mathrm{inp}(P, T_2)$ corresponds to a pair of side trees $(\Gamma, \Pi)$ where $\Gamma \in \text{side-tree}(P)$ and $\Pi \in \text{side-tree}(Q)$
for some $Q \in \mathcal{D}(T_2 \| T_1^r)$ such that $\Gamma$ and $\Pi$ contain some common labels. We call $(\Gamma, \Pi)$ an
intersecting side tree pair. Thus, $\mathrm{tnoe}(P)$ is at most the total number of intersecting side tree pairs
in $\text{side-tree}(P) \times \bigcup \{\text{side-tree}(Q) \mid Q \in \mathcal{D}(T_2 \| T_1^r)\}$.

To simplify our discussion, let $R = T_2 \| T_1^r$ and $\text{side-tree}(R) = \bigcup \{\text{side-tree}(Q) \mid Q \in \mathcal{D}(R)\}$.
Consider any node $w \in A(P)$. $T_1^w$ is a side tree in $\text{side-tree}(P)$. Let $R_w$ be $T_2 \| T_1^w$. Note that
each path in $R_w$ starting from a node $x$ to its descendant $y$ corresponds to a simple path $Q_{xy}$ in
$R$ from $x$ to $y$. Let $\mathrm{1st}(x, y)$ be the node on $Q_{xy}$ which is the child of $x$. By the definition of side
trees, among all the side trees in $\text{side-tree}(R)$, at most $\log |R^{\mathrm{1st}(x,y)}| + 1$ have roots on $Q_{xy}$.

For all side trees $R^v \in$ side-tree($R$), $(T_1^w, R^v)$ is an intersecting side tree pair if and only
if either (1) $v$ is a node on the path from the root of $R$ to the root of $R_w$; or (2) $v$ is a node on
some path $Q_{xy}$ in $R$ where $(x,y)$ is an edge in $R_w$. The number of side trees $R^v \in$ side-tree($R$)
in case (1) is less than $\log|R|$. The number of side trees $R^v \in$ side-tree($R$) in case (2) is less
than $\sum_{(x,y) \in R_w} (\log|R^{\mathrm{1st}(x,y)}| + 1)$. Let $\mathrm{sum}(R_w)$ denote $\sum_{(x,y) \in R_w} \log|R^{\mathrm{1st}(x,y)}|$. Below we prove
$\mathrm{sum}(R_w) = O\bigl(|R_w| \log\frac{2|R|}{|R_w|}\bigr)$. In total, $\mathrm{tnoe}(P) = O\bigl(\sum_{w \in A(P)} |R_w| \log\frac{2|R|}{|R_w|}\bigr)$, as claimed in this
statement.

It remains to prove $\mathrm{sum}(R_w) = O\bigl(|R_w| \log\frac{2|R|}{|R_w|}\bigr)$. For any leaf $y$ of $R_w$, let $p(y)$ be the maximal
path in $R_w$ ending at $y$ such that every node on $p(y)$ has at most one child; denote $r_{p(y)}$ as
the root of $p(y)$. Let $Z_{R_w} = \{p(y) \mid y \text{ is a leaf of } R_w\}$. As $\{R^{\mathrm{1st}(r_{p(y)},\,y)} \mid p(y) \in Z_{R_w}\}$ is a
set of disjoint subtrees of $R$, $\sum_{p(y) \in Z_{R_w}} |R^{\mathrm{1st}(r_{p(y)},\,y)}| \le |R|$. Note that $|Z_{R_w}| \le |R_w|$. Thus,
$\sum_{p(y) \in Z_{R_w}} \log|R^{\mathrm{1st}(r_{p(y)},\,y)}| \le |R_w| \log\frac{2|R|}{|R_w|}$.^1 Let $R'_w$ be the tree obtained by removing all the paths
in $Z_{R_w}$. We have $\mathrm{sum}(R_w) \le |R_w| \log\frac{2|R|}{|R_w|} + \mathrm{sum}(R'_w)$. Note that $R'_w$ contains at most half the
leaves of $R_w$. Hence, $\mathrm{sum}(R_w) = O\bigl(|R_w| \log\frac{2|R|}{|R_w|}\bigr)$.

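Statement 2. By Statement 1, tnoe is in the order of

\[
\begin{aligned}
&\sum_{P \in \mathcal{D}(T_1)} \sum_{w \in A(P)} |T_2\|T_1^w| \log\frac{2|T_2\|T_1^{r(P)}|}{|T_2\|T_1^w|} \\
&\le \sum_{P \in \mathcal{D}(T_1)} \sum_{w \in A(P)} \Delta_{T_1^w,T_2} \log\frac{2|T_2\|T_1^{r(P)}|}{|T_2\|T_1^w|} \\
&= \sum_{P \in \mathcal{D}(T_1)} \sum_{w \in A(P)} \Delta_{T_1^w,T_2} \bigl(1 + \log|T_2\|T_1^{r(P)}| - \log|T_2\|T_1^w|\bigr) \\
&\le \sum_{P \in \mathcal{D}(T_1)} \Bigl(\Delta_{T_1^{r(P)},T_2} + \Delta_{T_1^{r(P)},T_2} \log|T_2\|T_1^{r(P)}| - \sum_{w \in A(P)} \Delta_{T_1^w,T_2} \log|T_2\|T_1^w|\Bigr) \\
&= \sum_{P \in \mathcal{D}(T_1)} \Delta_{T_1^{r(P)},T_2} + \sum_{P \in \mathcal{D}(T_1)} \Bigl(\Delta_{T_1^{r(P)},T_2} \log|T_2\|T_1^{r(P)}| - \sum_{w \in A(P)} \Delta_{T_1^w,T_2} \log|T_2\|T_1^w|\Bigr) \\
&= \sum_{P \in \mathcal{D}(T_1)} \Delta_{T_1^{r(P)},T_2} + \Delta_{T_1^{r_0},T_2} \log|T_2\|T_1^{r_0}|, \quad \text{where } r_0 \text{ is the root of } T_1 \\
&\le \sum_{P \in \mathcal{D}(T_1)} \Delta_{T_1^{r(P)},T_2} + \Delta_{T_1,T_2} \log|T_2| \\
&\le \Delta_{T_1,T_2} \log|T_2| + \Delta_{T_1,T_2} \log|T_2| \\
&\le 2\Delta \log n.
\end{aligned}
\]

The second-to-last inequality holds by Lemma |^(1). □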



We proceed to detail the computing of $\mathrm{mwm}(H_{uv})$ for all $(u, v) \in \bigcup_{P \in \mathcal{D}(T_1)} \mathrm{inp}(P, T_2)$. A
bipartite graph is nontrivial if both node sets have at least two nodes. Computing $\mathrm{mwm}(H_{uv})$ for
all trivial $H_{uv}$ takes only linear time, i.e., $O(\mathrm{tnoe}) = O(\Delta \log n)$ time. Thus, we focus on those
nontrivial $H_{uv}$.
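
The linear-time claim for trivial graphs can be illustrated with a minimal sketch: when one node set has at most one node, a matching contains at most one edge, so a maximum weight matching is simply a heaviest edge, found in one scan. The edge-list representation and the name `trivial_mwm` are assumptions made for illustration and do not come from the paper.

```python
def trivial_mwm(edges):
    """Return (weight, matching) for a bipartite graph in which one node set
    has at most one node. `edges` is a list of (x, y, weight) triples; this
    representation is an assumption made for illustration."""
    best = None
    for edge in edges:  # single pass over the edges: O(number of edges)
        if edge[2] > 0 and (best is None or edge[2] > best[2]):
            best = edge
    if best is None:  # no edge of positive weight, so the matching is empty
        return 0, []
    return best[2], [best]
```

Summed over all trivial graphs, the scans touch each edge once, which matches the $O(\mathrm{tnoe})$ bound stated above.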

Consider any centroid path $P$ in $\mathcal{D}(T_1)$ and fix a node $u$ of $P$. Let $\mathcal{H}_u$ be the set of all nontrivial
graphs $H_{uv}$ where $(u, v) \in \mathrm{inp}(P, T_2)$. Let $TM_u$ be the time for finding $\mathrm{mwm}(H_{uv})$ for all the graphs
in $\mathcal{H}_u$. Let $\mathrm{tnoe}(u)$ be the number of edges of all the graphs in $\mathcal{H}_u$. In the next lemma, we first
derive an upper bound of $TM_u$, and then we show $\sum_{P \in \mathcal{D}(T_1)} TM_P = O(\sqrt{d}\,\Delta \log\frac{2n}{d})$.



^1 This follows from the fact that for any sequence of positive numbers $a_1, a_2, \ldots, a_k$ with the sum equal to $s$, $\sum_{i=1}^{k} \log a_i \le k \log\frac{s}{k}$.
Lemma 8.

1. $TM_u = O\bigl(\sqrt{\min(d, |T_1^u|)}\,\Delta_{S_u,T_2} + \mathrm{tnoe}(u)\bigr)$, where $S_u$ is the set of side trees of $u$ in $T_1$ and
$\Delta_{S_u,T_2} = \sum_{T \in S_u} \Delta_{T,T_2}$.

2. $\sum_{P \in \mathcal{D}(T_1)} TM_P = O(\sqrt{d}\,\Delta \log\frac{2n}{d})$.

Proof. The two statements are proved as follows. 

Statement 1. Let $B_u$ be the set of labels used in the side trees in $S_u$. First, we show that for all
$v \in T_2\|B_u$, $\mathrm{mwm}(H_{uv})$ can be computed in $O\bigl(\sqrt{\min(d, |T_1^u|)}\,\Delta_{S_u,T_2} + \mathrm{tnoe}(u)\bigr)$ time. Second, we
recover $\mathrm{mwm}(H_{uv})$ for all nontrivial $H_{uv}$ where $(u,v) \in \mathrm{inp}(P, T_2)$ in $O(\Delta_{S_u,T_2})$ time. Then this
statement follows.

To compute $\mathrm{mwm}(H_{uv})$ for all $v \in T_2\|B_u$, we apply the hierarchical bipartite matching
algorithm of §||. Let $T = T_2\|B_u$. For every node $v \in T$, we associate with $v$ the bipartite graph
$H_{uv}$ and let $w(v) = \Delta_{S_u,T^v}$. Observe that $w(v) = \Delta_{S_u,T^v} \ge \sum_{x \in C(v)} \Delta_{S_u,T^x} = \sum_{x \in C(v)} w(x)$. In
addition, for every node $x \in C(v)$, the total weight of all the edges incident to $x$ in $H_{uv}$ is at most
$w(x) = \Delta_{S_u,T^x}$. Hence, $T$ and the associated bipartite graphs $H_{uv}$ satisfy the conditions for the
hierarchical bipartite matching problem. For the time complexity, note that, for every $v \in T$, the
two node sets of $H_{uv}$ have size bounded by $d$ and $\min(d, |T_1^u|)$, respectively. Thus, by Theorem ||,
we can find $\mathrm{mwm}(H_{uv})$ for all nodes $v$ of $T = T_2\|B_u$ in $O\bigl(\sqrt{\min(d, |T_1^u|)}\,\Delta_{S_u,T_2} + \mathrm{tnoe}(u)\bigr)$ time.



Next, we show how to recover $\mathrm{mwm}(H_{uv})$ for all $v \in L$, where $L$ denotes the set of nodes $v$ of
$T_2$ such that $(u, v) \in \mathrm{inp}(P, T_2)$ and $H_{uv}$ is nontrivial. Note that every node $x$ of $T_2\|B_u$ is also a
node in $T_2$ and every edge $(x,y)$ of $T_2\|B_u$ corresponds to a path in $T_2$. Also observe that every
node $v \in L$ must be a node in $T_2\|B_u$; otherwise, $v$ lies on a path corresponding to an edge $(x,y)$
of $T_2\|B_u$, so $H_{uv}$ contains a singleton node set and is trivial, which contradicts $v \in L$. Therefore, we can
compute $\mathrm{mwm}(H_{uv})$ for all $v \in L$ by traversing $T_2\|B_u$ once using $O(|T_2\|B_u|) = O(\Delta_{S_u,T_2})$ time.

Statement 2. By Statement 1, 






by Lemma ^ 




by Lemma |^ 



□ 
4.3 The MAST algorithm

Our algorithm is based on the following recurrence, which generalizes the one given in ^ to handle 
labeled trees. 



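\[
\mathrm{mast}(T_1^u, T_2^v) = \max
\begin{cases}
\max\{\mathrm{mast}(T_1^u, T_2^x) \mid x \in C(v)\}, \\
\max\{\mathrm{mast}(T_1^x, T_2^v) \mid x \in C(u)\}, \\
\|\mathrm{mwm}(G_{uv})\| & \text{if } u \text{ and } v \text{ are both unlabeled}, \\
\|\mathrm{mwm}(G_{uv})\| + 1 & \text{for } u, v \text{ labeled with the same symbol},
\end{cases}
\tag{3}
\]

where $\|\mathrm{mwm}(G_{uv})\|$ denotes the total weight of the matching.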
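
To make the recurrence concrete, the following is a naive sketch that evaluates it directly on two small trees. It is not the algorithm of this section: it assumes that $G_{uv}$ has the children of $u$ and $v$ as its two node sets with edge weights $\mathrm{mast}(T_1^x, T_2^y)$, and it uses exhaustive search in place of a real matching algorithm. The class `Node` and the function names are hypothetical.

```python
from functools import lru_cache

class Node:
    """Rooted tree node; label is None for an unlabeled node."""
    def __init__(self, label=None, children=()):
        self.label = label
        self.children = tuple(children)

def brute_force_mwm(left, right, weight):
    """Maximum total weight of a matching between `left` and `right`,
    found by exhaustive search (exponential; for tiny examples only)."""
    def solve(i, used):
        if i == len(left):
            return 0
        best = solve(i + 1, used)  # leave left[i] unmatched
        for j, y in enumerate(right):
            if j not in used:
                best = max(best, weight(left[i], y) + solve(i + 1, used | {j}))
        return best
    return solve(0, frozenset())

def mast(t1, t2):
    """Evaluate the recurrence naively on the pair of roots (t1, t2)."""
    @lru_cache(maxsize=None)
    def rec(u, v):
        best = 0
        for x in v.children:  # first case: descend in the second tree
            best = max(best, rec(u, x))
        for x in u.children:  # second case: descend in the first tree
            best = max(best, rec(x, v))
        both_unlabeled = u.label is None and v.label is None
        same_label = u.label is not None and u.label == v.label
        if both_unlabeled or same_label:  # third and fourth cases
            m = brute_force_mwm(u.children, v.children, rec)
            best = max(best, m + (1 if same_label else 0))
        return best
    return rec(t1, t2)
```

For example, the trees ((a, b), c) and (a, (b, c)) with unlabeled internal nodes give the value 2, since any two of the three leaves agree but all three do not.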

Equation (3) suggests a bottom-up dynamic programming approach to computing $\mathrm{mast}(T_1, T_2)$.
The following lemma generalizes the technique of Cole et al. [Q] for speeding up the dynamic 
programming. Basically, it states that the time complexity is dominated by the time for finding 
maximum weight matchings of some graphs H uv . 



Lemma 9. Let $P \in \mathcal{D}(T_1)$ be a centroid path and $r = r(P)$. Given the values $\mathrm{mast}(T_1^u, (T_2\|T_1^u)^v)$
for all nodes $u \in A(P)$ and $v \in T_2\|T_1^u$, we can compute $\mathrm{mast}(T_1^r, (T_2\|T_1^r)^v)$ for all $v \in T_2\|T_1^r$ in
$O\bigl((\gamma(T_1^r) - \sum_{u \in A(P)} \gamma(T_1^u) + \Delta_{T_1^r,T_2}) \log d + TM_P\bigr)$ time, where $\gamma(R)$ denotes $\Delta_{R,T_2} \log|T_2\|R|$.

Cole et al. § proved Lemma 9 for the special case where $T_1$ and $T_2$ are binary evolutionary
trees. For a better flow of discussion, we postpone the proof of Lemma 9 to §4.4. Here, Lemma 9
immediately suggests that $\mathrm{mast}(T_1, T_2)$ can be computed in a bottom-up fashion as follows:

• Step 1. Let $\prec$ denote the ordering on $\mathcal{D}(T_1)$ where $P_1 \prec P_2$ if the root of $P_1$ is a descendant
of the root of $P_2$.

• Step 2. For every $P \in \mathcal{D}(T_1)$ in increasing order according to $\prec$, let $r$ denote the root of $P$;
apply Lemma 9 to find $\mathrm{mast}(T_1^r, (T_2\|T_1^r)^v)$ for every node $v \in T_2\|T_1^r$.
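
The processing order of Steps 1 and 2 can be sketched as follows. Centroid paths are defined earlier in the paper; the sketch assumes a path is built by repeatedly stepping to the child with the largest subtree, measured in nodes, which may differ in detail from the paper's definition. The function name and the dictionary-based tree representation are hypothetical.

```python
def centroid_paths(children, root):
    """Decompose a rooted tree into heavy (centroid) paths.
    `children` maps each node to a list of its children. Returns the paths
    ordered so that a path whose root is a descendant of another path's
    root comes first, an order compatible with the relation in Step 1."""
    # Iterative preorder, then subtree sizes in reverse preorder.
    order, stack = [], [root]
    depth = {root: 0}
    while stack:
        u = stack.pop()
        order.append(u)
        for c in children.get(u, []):
            depth[c] = depth[u] + 1
            stack.append(c)
    size = {}
    for u in reversed(order):
        size[u] = 1 + sum(size[c] for c in children.get(u, []))

    paths, starts = [], [root]
    while starts:
        u = starts.pop()
        path = [u]
        while children.get(u):
            heavy = max(children[u], key=lambda c: size[c])
            # Every other child is the root of a new path hanging off this one.
            starts.extend(c for c in children[u] if c != heavy)
            path.append(heavy)
            u = heavy
        paths.append(path)
    # Deeper roots first: a descendant root is always strictly deeper.
    paths.sort(key=lambda p: -depth[p[0]])
    return paths
```

Processing the returned list from front to back guarantees that, when a path $P$ is handled, every path rooted at a node of $A(P)$ has already been handled, which is what Lemma 9 requires as input.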

The above algorithm at the end computes $\mathrm{mast}(T_1^{r(P_0)}, T_2\|T_1^{r(P_0)})$, where $P_0$ is the root centroid
path of $T_1$. Since $r(P_0)$ is also the root of $T_1$, we have $T_1^{r(P_0)} = T_1$ and $\mathrm{mast}(T_1^{r(P_0)}, T_2\|T_1^{r(P_0)}) =
\mathrm{mast}(T_1, T_2)$. As stated in the following lemma, the running time is dominated by the time for
computing the maximum weight matchings, i.e., $\sum_{P \in \mathcal{D}(T_1)} TM_P$.

Lemma 10. We can compute $\mathrm{mast}(T_1, T_2)$ in $O(\Delta \log n \log d + \sum_{P \in \mathcal{D}(T_1)} TM_P)$ time, where $\Delta = \Delta_{T_1,T_2}$.

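Proof. To derive the time for computing $\mathrm{mast}(T_1, T_2)$, we simply sum the time bound stated in
Lemma 9 over all centroid paths of $T_1$. Observe that

\[
\begin{aligned}
\sum_{P \in \mathcal{D}(T_1)} \Bigl(\gamma(T_1^{r(P)}) - \sum_{u \in A(P)} \gamma(T_1^u)\Bigr)
&= \gamma(T_1^{r_0}), \quad \text{where } r_0 \text{ is the root of } T_1 \\
&= \Delta_{T_1,T_2} \log|T_2\|T_1| \\
&\le \Delta_{T_1,T_2} \log|T_2| \\
&\le \Delta \log n.
\end{aligned}
\]

The first equality holds because every node in $A(P)$ is the root of exactly one other centroid path, so all terms except the one for the root of $T_1$ cancel. Thus, we can compute $\mathrm{mast}(T_1, T_2)$ in $O\bigl(\Delta \log n \log d + \sum_{P \in \mathcal{D}(T_1)} TM_P + \sum_{P \in \mathcal{D}(T_1)} \Delta_{T_1^{r(P)},T_2} \log d\bigr)$
time. By Lemma ||, $\sum_{P \in \mathcal{D}(T_1)} \Delta_{T_1^{r(P)},T_2} \le \Delta \log n$. Thus, this lemma follows. □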
Theorem 11. $\mathrm{mast}(T_1, T_2)$ can be computed in $O(\sqrt{d}\,\Delta_{T_1,T_2} \log\frac{2n}{d})$ time.

Proof. By Lemma 8, $\sum_{P \in \mathcal{D}(T_1)} TM_P = O(\sqrt{d}\,\Delta \log\frac{2n}{d})$. Thus, by Lemma 10, $\mathrm{mast}(T_1, T_2)$ can
be computed in $O\bigl(\Delta(\log n \log d + \sqrt{d} \log\frac{2n}{d})\bigr)$ time. Since $\log n \log d = O(\sqrt{d} \log\frac{2n}{d})$, this theorem
follows. □

4.4 Proof of Lemma 9

This section provides the details for adapting the techniques of Cole et al. H to prove Lemma 9.
Consider any centroid path $P \in \mathcal{D}(T_1)$. Let $r$ be the root of $P$. Lemma 9 states that if we are
given, for every $u \in A(P)$,

$\mathrm{mast}(T_1^u, (T_2\|T_1^u)^v)$ for all $v \in T_2\|T_1^u$,

then we can compute

$\mathrm{mast}(T_1^r, (T_2\|T_1^r)^v)$ for all $v \in T_2\|T_1^r$ (4)

in $O\bigl((\gamma(T_1^r) - \sum_{u \in A(P)} \gamma(T_1^u) + \Delta_{T_1^r,T_2}) \log d + TM_P\bigr)$ time.

The centroid paths in $\mathcal{D}(T_2\|T_1^r)$ partition the set of nodes of $T_2\|T_1^r$ and define an ordering on
the nodes of $T_2\|T_1^r$. Precisely, the set of values in Equation (4) is partitioned into the following
sets

$\{\mathrm{mast}(T_1^r, (T_2\|T_1^r)^v) \mid v \in Q\}$ where $Q \in \mathcal{D}(T_2\|T_1^r)$.

We focus on computing $\{\mathrm{mast}(T_1^r, (T_2\|T_1^r)^v) \mid v \in Q\}$ for each $Q \in \mathcal{D}(T_2\|T_1^r)$. Cole et al. g
dealt with the special case where $T_1$ and $T_2$ are binary evolutionary trees. They introduced the
maximum agreement matching (MAM) problem and showed that $\{\mathrm{mast}(T_1^r, (T_2\|T_1^r)^v) \mid v \in Q\}$
can be computed by solving the MAM problem on some weighted bipartite multigraph. In |2^| ,
Przytycka observed that this technique can be generalized to evolutionary trees of arbitrary degrees;
basically, it suffices to use a more complicated bipartite multigraph. We observe that this can be
further generalized to labeled trees with arbitrary degrees by adding more edges to the multigraph.

In the rest of this section, we define the maximum agreement matching problem and the
weighted bipartite multigraph $\mathcal{G}_{PQ}$ for handling labeled trees with general degrees.

The maximum agreement matching problem. Let $\mathcal{G} = (X, Y, E)$ be a weighted bipartite
multigraph. Suppose that $X = \{u_1, u_2, \ldots, u_p\}$, $Y = \{v_1, v_2, \ldots, v_q\}$, and every pair of nodes is
connected by at most four edges. Every edge is colored gray, green, red, or white. We say
that edge $(u_i, v_j)$ is below edge $(u_k, v_l)$ if $i < k$ and $j < l$, and that $(u_i, v_j)$ crosses $(u_k, v_l)$ if $i < k$
and $j > l$. A matching of $\mathcal{G}$ is an agreement matching if it satisfies all the following properties:

• No white edge crosses another white edge. 

• There is at most one gray edge. If a gray edge is present, it must be below all the white 
edges. 

• There is at most one pair of red and green edges. If such a pair is present, then both
edges are below all white edges, and the red edge crosses the green edge.

• A gray edge cannot coexist with a pair of red and green edges. 
The weight of an agreement matching of $\mathcal{G}$ is the total weight of the edges in the matching. A
maximum agreement matching is one with the maximum weight, and we denote this weight as
$\mathrm{mam}(\mathcal{G})$.
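
The four properties can be checked mechanically. The sketch below is illustrative only: it assumes edges are given as `(i, j, color)` triples with 1-based indices, it reads "at most one pair of red and green edges" as meaning that red and green edges occur together as one pair or not at all, and it applies the directional definition of "crosses" literally. The function names are hypothetical.

```python
def below(e, f):
    """e = (i, j, ...) is below f = (k, l, ...) if i < k and j < l."""
    return e[0] < f[0] and e[1] < f[1]

def crosses(e, f):
    """e crosses f if i < k and j > l (the directional definition)."""
    return e[0] < f[0] and e[1] > f[1]

def is_agreement_matching(edges):
    """`edges` is a list of (i, j, color) triples, with color one of
    "white", "gray", "red", "green"."""
    # Matching: no two chosen edges share an endpoint.
    if len({e[0] for e in edges}) < len(edges):
        return False
    if len({e[1] for e in edges}) < len(edges):
        return False
    by_color = {c: [e for e in edges if e[2] == c]
                for c in ("white", "gray", "red", "green")}
    white, gray = by_color["white"], by_color["gray"]
    red, green = by_color["red"], by_color["green"]
    # No white edge crosses another white edge.
    if any(crosses(e, f) for e in white for f in white):
        return False
    # At most one gray edge, and it lies below all white edges.
    if len(gray) > 1 or any(not below(g, w) for g in gray for w in white):
        return False
    # Assumption: red and green edges occur only as one pair, or not at all.
    if len(red) != len(green) or len(red) > 1:
        return False
    if red:
        r, g = red[0], green[0]
        if not crosses(r, g):
            return False
        if any(not below(e, w) for e in (r, g) for w in white):
            return False
        if gray:  # a gray edge cannot coexist with a red/green pair
            return False
    return True
```

A brute-force MAM solver could enumerate subsets of edges and keep the heaviest one accepted by this check; that is useful only for validating small instances.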

For any nodes $u_i \in X$ and $v_j \in Y$, let $\mathcal{G}(u_i, v_j)$ denote the subgraph of $\mathcal{G}$ induced by the
nodes $u_i, u_{i+1}, \ldots, u_p$ and $v_j, v_{j+1}, \ldots, v_q$. The maximum agreement matching problem asks for
$\mathrm{mam}(\mathcal{G}(u_i, v_j))$ for all pairs $(u_i, v_j)$ such that either (1) $u_i = u_1$ and $v_j$ is adjacent to some edges
of $\mathcal{G}$; or (2) $v_j = v_1$ and $u_i$ is adjacent to some edges of $\mathcal{G}$.
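
As a small illustration of which subproblems are requested, the sketch below lists the query pairs and builds the induced subgraph for one of them. It assumes edges are tuples whose first two entries are the 1-based indices $i$ and $j$; the function names are hypothetical.

```python
def induced_subgraph(edges, i, j):
    """Edges of G(u_i, v_j): keep every edge (k, l, ...) with k >= i and l >= j."""
    return [e for e in edges if e[0] >= i and e[1] >= j]

def mam_queries(edges):
    """Pairs (i, j) for which mam(G(u_i, v_j)) is requested: (1, j) for every
    v_j with an incident edge, and (i, 1) for every u_i with an incident edge."""
    pairs = {(1, e[1]) for e in edges} | {(e[0], 1) for e in edges}
    return sorted(pairs)
```

Note that the number of requested pairs is at most $p + q$, rather than the $pq$ pairs of all possible $(u_i, v_j)$.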

The weighted bipartite multigraph $\mathcal{G}_{PQ}$. Roughly speaking, $\mathcal{G}_{PQ}$ is constructed by adding
suitable colored edges between $P$ and $Q$. Our aim is that by solving the MAM problem on $\mathcal{G}_{PQ}$,
all the values in Equation (4) are found automatically.

First, we define a new graph $H'_{uv}$ from $G_{uv}$ and $H_{uv}$ as follows. $H'_{uv}$ has all the edges of $H_{uv}$,
as well as some other edges from $G_{uv}$. Among all the edges of $G_{uv}$ adjacent to hvy($u$), we add into
$H'_{uv}$ those edges (hvy($u$), $y$) where $y$ is adjacent to some edges of $H_{uv}$. Among the rest of the edges
adjacent to hvy($u$), we add into $H'_{uv}$ the one with the heaviest weight. Similarly, among all edges
adjacent to hvy($v$), we choose some edges to add into $H'_{uv}$.

We are now ready to define $\mathcal{G}_{PQ}$. Suppose that $P = (u_1, u_2, \ldots, u_p)$ and $Q = (v_1, v_2, \ldots, v_q)$.
There are one or more edges between nodes $u_i$ and $v_j$ if and only if $(u_i, v_j) \in \mathrm{inp}(P, Q)$. The
number, color, and weight of edges between $u_i$ and $v_j$ are determined in the three cases below. Let
$\mathrm{MAX}_P = \max\{\mathrm{mast}(T_1^{u_i}, T) \mid T \text{ is a side tree of } v_j\}$. Let $\mathrm{MAX}_Q = \max\{\mathrm{mast}(T, T_2^{v_j}) \mid T \text{ is a side
tree of } u_i\}$.

Case 1: $u_i$ and $v_j$ are both unlabeled. There are a white edge, a gray edge, a green edge and
a red edge connecting $u_i$ and $v_j$, with weights $\|\mathrm{mwm}(H_{u_i v_j})\|$, $\|\mathrm{mwm}(H'_{u_i v_j})\|$, $\mathrm{MAX}_P$, and $\mathrm{MAX}_Q$,
respectively.

Case 2: $u_i$ and $v_j$ are labeled by the same symbol $z$. There are a white edge and a gray edge
connecting them. The weight of the white edge is $\|\mathrm{mwm}(H_{u_i v_j})\| + \mu(z)$. The weight of the gray
edge equals the maximum of $\|\mathrm{mwm}(H'_{u_i v_j})\| + \mu(z)$, $\mathrm{MAX}_P$, and $\mathrm{MAX}_Q$.

Case 3: either $u_i$ and $v_j$ are labeled by different symbols, or only one of them is labeled. There
is only one gray edge connecting them. Its weight equals the larger of $\mathrm{MAX}_P$ and $\mathrm{MAX}_Q$.
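
The three cases translate directly into a small function. This is a sketch under stated assumptions: the matching weights and the two MAX values are taken as precomputed inputs, `max_a` and `max_b` stand for the two MAX values in the order the text lists them (which of them belongs to the green edge and which to the red edge is not recoverable with certainty from this copy), and the label weight defaults to 1 as in recurrence (3). The function name and signature are hypothetical.

```python
def edges_between(label_u, label_v, mwm_h, mwm_h_prime, max_a, max_b, mu=None):
    """Return the colored edges, as (color, weight) pairs, placed between
    u_i and v_j when (u_i, v_j) is in inp(P, Q). Labels are None for
    unlabeled nodes; `mu` maps a label to its weight (assumed 1 if omitted)."""
    if label_u is None and label_v is None:  # Case 1: both unlabeled
        return [("white", mwm_h), ("gray", mwm_h_prime),
                ("green", max_a), ("red", max_b)]
    if label_u is not None and label_u == label_v:  # Case 2: same symbol
        w = 1 if mu is None else mu[label_u]
        return [("white", mwm_h + w),
                ("gray", max(mwm_h_prime + w, max_a, max_b))]
    # Case 3: different symbols, or exactly one node labeled
    return [("gray", max(max_a, max_b))]
```

Every pair thus contributes at most four edges, in line with the bound assumed in the definition of the MAM problem.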

Note that when the input is evolutionary trees, $\mathcal{G}_{PQ}$ is reduced to the multigraph defined in
p2[ , in which most of the edges are from Case 1, and there are edges $(u_i, v_j)$ from Cases 2 and
3 only when $u_i$ and $v_j$ are leaves. For labeled trees, we simply add extra edges in Cases 2 and 3
when $u_i$ or $v_j$ is a labeled internal node. By construction, we have the following fact.

Fact 12. For any $u_i \in P$ and $v_j \in Q$, $\mathrm{mam}(\mathcal{G}_{PQ}(u_i, v_j)) = \mathrm{mast}(T_1^{u_i}, (T_2\|T_1^r)^{v_j})$. Thus, solving
the MAM problem on $\mathcal{G}_{PQ}$ gives $\mathrm{mast}(T_1^r, (T_2\|T_1^r)^v)$ for all $v \in Q$.

Using the techniques of Cole et al. j|, ^2|, we can construct $\mathcal{G}_{PQ}$ for all $Q \in \mathcal{D}(T_2\|T_1^r)$ and
solve the corresponding MAM problems in $O\bigl((\gamma(T_1^r) - \sum_{u \in A(P)} \gamma(T_1^u) + \Delta_{T_1^r,T_2}) \log d + TM_P\bigr)$
total time. Therefore, Lemma 9 follows.